[AIEWF Preview] Multi-Turn RL for Multi-Hour Agents — with Will Brown, Prime Intellect

Update: 2025-05-23

Description

In an otherwise heavy week packed with Microsoft Build, Google I/O, and OpenAI io, the worst-kept secret in biglab land was the launch of Claude 4, particularly the triumphant return of Opus, which many had been clamoring for. We will leave the specific Claude 4 recap to AINews; however, we think that both Gemini’s progress on Deep Think this week and Claude 4 represent the next frontier of progress on inference-time compute and reasoning (at least until GPT-5 ships this summer).

Will Brown’s talk at AIE NYC and open source work on verifiers have made him one of the most prominent voices able to publicly discuss (aka without the vaguepoasting LoRA they put on you when you join a biglab) the current state of the art in reasoning models and where current SOTA research directions lead. We discussed his latest paper, Reinforcing Multi-Turn Reasoning in LLM Agents via Turn-Level Credit Assignment, and he previewed his AIEWF talk on Agentic RL for those with the temerity to power through bad meetup audio.

In Channel

Better Data is All You Need — Ari Morcos, Datology

2025-08-29 · 01:18:42

Long Live Context Engineering - with Jeff Huber of Chroma

2025-08-19 · 57:00

Greg Brockman on OpenAI's Road to AGI

2025-08-15 · 01:08:36

The RLVR Revolution — with Nathan Lambert (AI2, Interconnects.ai)

2025-07-31 · 01:18:59

AI is Eating Search

2025-07-23 · 56:21

Cline: the open source coding agent that doesn't cut costs

2025-07-16 · 01:15:43

Personalized AI Language Education — with Andrew Hsu, Speak

2025-07-11 · 01:04:09

AI Video Is Eating The World — Olivia and Justine Moore, a16z

2025-07-09 · 49:27

Information Theory for Language Models: Jack Morris

2025-07-02 · 01:18:13

Scaling Test Time Compute to Multi-Agent Civilizations — Noam Brown, OpenAI

2025-06-19 · 01:17:46

The Shape of Compute (Chris Lattner of Modular)

2025-06-13 · 01:18:17

The Utility of Interpretability — Emmanuel Ameisen

2025-06-06 · 01:53:01

[AIEWF Preview] Containing Agent Chaos — Solomon Hykes

2025-06-03 · 27:13

[AIEWF Preview] CloudChef: Your Robot Chef - Michelin-Star food at $12/hr (w/ Kitchen tour!)

2025-05-31 · 20:49

The AI Coding Factory

2025-05-29 · 59:22

[AIEWF Preview] Multi-Turn RL for Multi-Hour Agents — with Will Brown, Prime Intellect

2025-05-23 · 39:57

ChatGPT Codex: The Missing Manual

2025-05-16 · 53:31

Claude Code: Anthropic's CLI Agent

2025-05-07 · 01:17:21

⚡️The Rise and Fall of the Vector DB Category

2025-05-01 · 27:16

Why Every Agent needs Open Source Cloud Sandboxes

2025-04-24 · 01:06:38

swyx + Alessio